1 Introduction

Standard micro-economics concentrates on the description of markets but is
seldom interested in production. Several economists have discussed the concept
of a firm, as opposed to an open labour market where entrepreneurs would
recruit workers on the occasion of each business opportunity. Coase [1] is
one of them: he explains the existence of firms as institutions by the fact
that they reduce transaction costs with respect to an open labour market.

Whatever the rationale proposed by economists to account for the ex- 
istence of firms, their perspective is based on efficiency and cost analysis. 
Little attention is paid to the dynamics of emergence and evolution of firms. 

The aim of the present manuscript is to check the global dynamical
properties of a very simple model based on bounded rationality and
reinforcement learning. Workers and managers are localised on a lattice and
they choose collaborators on the basis of the success of previous work
relations. The choice algorithm is largely inspired by the observation and
modelling of long term customer/seller relationships on perishable goods
markets, discussed in Weisbuch et al. [2] and Nadal et al. [3].

The model presented here is in no way an alternative to Coase. We 
describe the build-up of long term relationships which do reduce transaction 
costs, and we deduce the dynamical properties of networks built from our 
simple assumptions. 

2 The model 

2.1 The model 

Let us imagine a production network of workers. We use the simplest
structure, a lattice: at each node is localised a "worker" with a given
production capacity of 1. Business opportunities of size $Q$ randomly strike
"entrepreneur" sites at the surface of the lattice.

The work load received by the entrepreneur is too large to be carried out
by her alone: she then distributes it randomly to her neighbours upstream;
let us say that these neighbours are her nearest neighbours upstream. We
postulate two mechanisms here: a probabilistic choice process according to
preferences for the different neighbours, and the updating of preferences as
a function of previous gains. The probability of choosing neighbour $j$ is
given by the logit function:

$$P_j = \frac{\exp(\beta J_j)}{\sum_{k=1}^{n_{nb}} \exp(\beta J_k)} \qquad (1)$$

where the sum extends to all neighbours of the node. The preferences $J_j$
are updated at each time step according to:

$$J_j(t) = (1 - \gamma) J_j(t-1) + q_j(t) \qquad (2)$$

where $q_j(t)$ is the work load attributed to node $j$.
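The two mechanisms of equations (1) and (2) can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function names are hypothetical, preferences are held in a plain list with one entry per neighbour, and the maximum preference is subtracted before exponentiating purely for numerical safety (it does not change the probabilities).

```python
import math
import random

def choice_probabilities(prefs, beta):
    """Logit probabilities of equation (1) for a list of preferences J_k."""
    # Subtracting the maximum leaves all ratios unchanged and avoids overflow.
    m = max(prefs)
    weights = [math.exp(beta * (j - m)) for j in prefs]
    total = sum(weights)
    return [w / total for w in weights]

def update_preferences(prefs, loads, gamma):
    """Equation (2): discount old preferences, then add the attributed loads."""
    return [(1.0 - gamma) * j + q for j, q in zip(prefs, loads)]

def choose_neighbour(prefs, beta, rng=random):
    """Draw one neighbour index with the probabilities of equation (1)."""
    probs = choice_probabilities(prefs, beta)
    return rng.choices(range(len(prefs)), weights=probs, k=1)[0]
```

With equal preferences every neighbour is equally likely; once one preference is much larger than the others by more than about $1/\beta$, that neighbour is chosen almost always.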

One time step corresponds to the distribution of work loads across the 
set of collaborators of the entrepreneur who received the work load. 

A series of work loads strike the entrepreneur at successive time steps. We 
want to characterise the asymptotic structure generated by a large number 
of work loads presented in succession to the entrepreneur. 

2.2 The algorithm 

Workers are placed on a $(d + 1)$-dimensional hyper-cubic lattice of height
$L_z$. Each hyper-plane line = 1, 2, ..., $L_z$ is a lattice of linear
dimension $L$ with $L^d$ sites and periodic (helical) boundary conditions. A
workload $Q$ is distributed from the top level (hyper-plane) line = 1
upstream, in steps from level line to level line + 1, until $Q$ different
workers (sites) $i$ each have a local workload $q_i = 1$. All local workloads
$q_i$ are integers 1, 2, ..., $Q$.

One iteration corresponds to the downward distribution of one workload
and proceeds as follows. Initially all sites have workload zero. A new
workload arrives at the central site of the top level. Thereafter, each site
$i$ on level line having a local workload $q_i > 1$ distributes the surplus
$q_i - 1$ to its $n_{nb} = 2d + 1$ nearest and next-nearest neighbours $j$ on
the lower level line + 1, in unit packets, $q_i \to q_i - 1$. For this purpose
it selects, again and again, randomly one such neighbour $j$ and transfers to
it, with probability $P_j$ given by equation (1), one unit of workload,
increasing by one unit the preference $J_{ij}$ storing the history of work
relations. After site $i$ has distributed its workload in this way to the
lower level of the hierarchy and has only a remaining unit workload, the
algorithm moves to the next site having a local workload bigger than unity.
The whole iteration stops when the lowest level line = $L_z$ is reached or
when no site has a local workload above unity. Then all workloads are set
back to zero, all stored preferences $J$ are diminished by a factor
$1 - \gamma$, and a new iteration starts, influenced by the past history
stored in the preferences $J_{ij}$.
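One iteration of this algorithm can be sketched as follows for the (1+1)-dimensional case. This is a minimal illustration, not the program used for the results reported here, and it makes several assumptions: the name `run_iteration` and the dictionary storage are hypothetical, levels are numbered from 0, boundaries are plain periodic rather than helical, and the receiving neighbour is drawn directly from the probabilities of equation (1) instead of by repeated trial selection (the two give the same distribution).

```python
import math
import random

def run_iteration(prefs, Q, L, Lz, beta, gamma, rng):
    """Distribute one workload Q downwards; return {(level, x): final load}.

    prefs maps (level, x, target_x) -> J_ij and is updated in place.
    """
    load = {(0, L // 2): Q}                  # workload arrives at the top central site
    for level in range(Lz - 1):              # the lowest level cannot pass work on
        for (lv, x), q in list(load.items()):
            if lv != level or q <= 1:
                continue
            targets = [(x - 1) % L, x, (x + 1) % L]   # 2d + 1 = 3 sites below
            for _ in range(q - 1):           # pass on the surplus in unit packets
                js = [prefs.get((level, x, t), 0.0) for t in targets]
                m = max(js)
                w = [math.exp(beta * (j - m)) for j in js]   # equation (1)
                t = rng.choices(targets, weights=w, k=1)[0]
                prefs[(level, x, t)] = prefs.get((level, x, t), 0.0) + 1.0
                load[(level + 1, t)] = load.get((level + 1, t), 0) + 1
            load[(level, x)] = 1             # the site keeps one unit for itself
    for key in prefs:                        # forgetting at the end of the iteration
        prefs[key] *= 1.0 - gamma
    return load
```

Calling the function repeatedly with the same `prefs` dictionary reproduces the succession of workloads: each call starts from zero workloads but inherits the preferences left by the previous ones.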

3 Simulation results 

3.1 Equilibration 

Fig. 1 shows that for the chosen parameters a stationary equilibrium is
reached between the increase of the sum of all $J_{ij}$, called the flow, due
to new work, and the decrease of the flow due to the forgetting parameter.
The depth of the load pattern in the lattice also increases and finally
saturates, as does the flow. These dynamics are similar in lower (1+1) and
higher (1+4) dimensions.
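The balance between new work and forgetting can be made explicit with a rough argument. It assumes, as a simplification, that the number $W$ of unit transfers per iteration is constant and that the flow $F$ obeys the same update as equation (2); $F$ and $W$ are notation introduced only for this sketch.

```latex
F(t) = (1 - \gamma)\, F(t-1) + W
\quad\Longrightarrow\quad
F^{*} = (1 - \gamma)\, F^{*} + W
\quad\Longrightarrow\quad
F^{*} = \frac{W}{\gamma}
```

If the forgetting factor is instead applied after the new work has been added, as in the algorithm of section 2.2, the same argument gives $F^{*} = (1 - \gamma) W / \gamma$. In both cases the stationary flow grows with the amount of work passed on and decreases with $\gamma$.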

3.2 Snakes and blobs 

Depending on $\beta$, $\gamma$ and the load values, two dynamical regimes are
observed after many iterations: a quasi-deterministic regime in which only
one link out of $2d + 1$ is systematically chosen, resulting in a "snake"
portion of the work pattern, and a random regime where all $2d + 1$ links
(3 for $d = 1$) are used, resulting in a "blob" portion of the work pattern.
The interface between the two regimes corresponds to

$$\beta \, (q(z) - 1) / \gamma = \text{Constant} \qquad (3)$$

where $q(z) - 1$ is the work load distributed by a node of charge $q(z)$ at
depth $z$. Because the work load to be distributed, $q(z) - 1$, decreases
with increasing depth $z$, there is a given depth at which the interface
between the deterministic regime and the random regime is located.

Figure 2 displays workloads obtained in a (1+1)-dimensional lattice. In
this figure the snake extends from the initial load of 20 to the load of 7,
followed by a small blob of height 3. Parameters for this simulation were
$\gamma = 0.3$, $\beta = 0.3$.

A mean field theory, proposed in a different context by Weisbuch et al. [2]
and Nadal et al. [3], predicts a transition between the head and tail regimes
at a depth $z$ such that:

$$\frac{\beta \, (q(z) - 1)}{\gamma} = 2d + 1 \qquad (4)$$

For larger values of $q(z)$, all the work load is transferred to a single
neighbour with a preference coefficient of $(q(z) - 1)/\gamma$, and all other
coefficients are 0. For lower values of $q(z)$, all preference coefficients
are small, with possible fluctuations around the interface. These predictions
are verified in figure 3, computed for simulations in 1+1, 1+2 and 1+3
dimensions.
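Equation (4) can be solved for the load at the interface. The helper names below are hypothetical, and the snake length relies on the additional assumption $q(z) = Q - z$, the relation used later in the text for the snake part.

```python
def critical_load(beta, gamma, d):
    """Load q(z) at which equation (4), beta * (q - 1) / gamma = 2d + 1, holds."""
    return 1.0 + (2 * d + 1) * gamma / beta

def snake_length(Q, beta, gamma, d):
    """Mean-field depth of the interface, assuming q(z) = Q - z along the snake."""
    return max(0.0, Q - critical_load(beta, gamma, d))
```

For $\beta = \gamma$ the predicted surplus $q(z) - 1$ at the interface is simply $2d + 1$; increasing $\beta$ at fixed $\gamma$ lowers the critical load and therefore lengthens the snake.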

Figure 4 is a more systematic test of these dynamics in 1+1 dimensions. Here
we plot the mean square distance of the positions of the workers in each
hyper-plane (= line) from the position of the highest worker concentration
(more precisely: from the centre of mass of their distribution). In the tail
this squared width is exactly zero (left part), while in the blob (right part
of each curve) it has a peak. The peak position shifts from small depths
(close to the top plane, no snake) to large depths (close to the bottom plane
at 60, long snake) when $\beta$ increases. Similar plots of the snake lengths
were obtained for $d = 2$ and 3 (not shown); also, one test for $d = 4$,
$L = 29$ displays an interface between zero and positive width.
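The squared width of one line can be computed as in the following sketch. The function name and the input convention (a list of occupation counts per site) are assumptions made for illustration, and the wrap-around of the periodic boundaries is ignored, which is harmless as long as the pattern stays away from the lattice edges.

```python
def squared_width(counts):
    """Mean square distance from the centre of mass for one lattice line.

    counts[x] is the number of times site x of this line held work.
    """
    total = sum(counts)
    if total == 0:
        return 0.0                      # line never reached by the workload
    mean = sum(x * c for x, c in enumerate(counts)) / total
    return sum(c * (x - mean) ** 2 for x, c in enumerate(counts)) / total
```

A line crossed by the snake has all its weight on one site and gives exactly zero, while a line inside the blob spreads over several sites and gives a positive width.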

We plot in Fig. 5 the average (over ten samples of 10,000 iterations each)
of the position of the lowest hyper-plane touched by the work distribution
process as a function of $\beta$. Although the statistics clearly indicate
that the depth of the system follows the same trend from 1+1 to 1+3
dimensions, they cannot be interpreted directly. The measured depth is in
fact the sum of the length of the snake part and the height of the blob
part. Both parts vary with $\beta$. The snake length is obtained from equation
(3) since $q(z)$ is simply $Q - z$. But the blob height depends upon the
charge at the interface, and we do not have any simple analytic expression
for it.

Figure 6 shows the total number $N_t$ of sites which were involved in at
least one of the 10,000 iterations, i.e. the total work force with long-time
employment, and also the fluctuations $N_f$ in the work force from one
iteration to the next. (Thus $N_f$ is the number of sites which are used at
iteration $t$ and not at iteration $t - 1$, or the other way around.) We see
that for large $\beta$ the fluctuations are diminished (the same snake tail
passes on the work again and again), but this decrease is accompanied by an
increase in $N_t$, an effect which helps the labour market but not the
company.
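Both quantities follow directly from the sets of sites used in successive iterations, as in this small sketch. The function name and the representation of each iteration as a Python set of site labels are assumptions for illustration.

```python
def workforce_statistics(history):
    """history is a list of sets of working sites, one set per iteration.

    Returns (N_t, fluctuations): the number of sites used at least once, and,
    for each pair of consecutive iterations, the number of sites used in
    exactly one of the two (the size of the symmetric difference).
    """
    total = set().union(*history)
    fluctuations = [len(a ^ b) for a, b in zip(history, history[1:])]
    return len(total), fluctuations
```

For example, `workforce_statistics([{1, 2, 3}, {2, 3, 4}, {2, 3, 4}])` counts four distinct sites overall, with two sites changing between the first two iterations and none between the last two.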

In the above version, the managers who distributed work to their lower
neighbours also took over one work unit each for themselves. If instead they
pass on all the workload given to them (provided it was at least two units),
then each iteration requires not only $Q$ people as above, but $Q$ workers
plus a fluctuating number of managers. Moreover, if the snake hits the bottom
line at depth $L_z = Q$, part of the work is never finished. This is hardly
an efficient way to run a business, but Figures 7 and 8 show a much sharper
transition, from a localised cluster at low $\beta$ to delocalised snake
tails at high $\beta$.

4 Conclusions 

The simple reinforcement learning presented here does end up in a metastable
path in the worker space, represented here by the snake + blob picture, which
we interpret as a firm. On the other hand, we would rather imagine firms as
hierarchical structures such as trees [4] [5]. Because of the sharp
blob-snake transition as a function of $z$, we never observe a well balanced
tree with a selection at each node of several preferred collaborators, but
rather either a nearly complete preference for one neighbour or roughly equal
preference for all.

In conclusion, the present model explains the metastability of employment
relations in the firm, but something has to be added to it to explain the more
efficient distribution of workload observed in real firms.

The present manuscript was written during GW's and DS's stay at the
physics department of the Hebrew University in Jerusalem, which we thank
for its hospitality. It was supported by GIACS, a Coordination Action of the
European Communities.

[Figure 1 plots. Panel titles: "Flow (up), depth (down), d=1, L=29" (top)
and "Flow (up), depth (down), d=4, L=29" (bottom); horizontal axis:
iterations.]

Figure 1: Top: Equilibration of the sum of all preference coefficients (top
curve), and depth at $\beta = 0.1$, $\gamma = 0.3$, 1+1 dimensions, $Q = 60$,
averaged over 10,000 iterations. Bottom: Same parameters except for $d = 4$.

Figure 2: One instance of the work load repartition from an initial load of
20 at the top site down to the lower line. The load is initially distributed
with a strong preference for one neighbour out of three and is then
distributed more uniformly from load 7 onwards. $\beta = 0.3$,
$\gamma = 0.3$, $Q = 20$.

[Figure 3 plot. Panel title: "Snake tails for beta = gamma = 0.03, Q=20"]

Figure 3: Evolution in $1 + d$ dimensions of the preference coefficient with
distance from the "surface". For smaller depths, in the tail region, the
preference coefficients are either strong ($(q(z) - 1)/\gamma$) and
independent of the dimension $d$, or zero. A transition is observed around
charges of 5, 4 and 3, respectively, rather than at $2d + 1 = 7$, 5 and 3,
respectively, as predicted by the mean field theory (equation (3)). Top part:
Overall picture emphasising the snake. Bottom part: Enlargement of the blob
region. $Q = 20$, $\beta = 0.03$, $\gamma = 0.03$, $d = 1$ (+), 2 (x), 3 (*).

[Figure 4 plot. Panel title: "beta=.01, .03, .05, .1, .2, .4 left to right"]

Figure 4: Variation with vertical position of the square width of the working
region within a horizontal hyper-plane (= line), averaged over the last 5,000
of 10,000 iterations. $\gamma = 0.3$, $d = 1$, $Q = 60$. Similar results were
obtained for $d = 2$ and 3.



[Figure 5 plot. Panel title: "Ten 60*30^d samples d=1 (top), 2 (middle),
3 (bottom)"]

Figure 5: Variation with $\beta$ of the vertical position of the lowest
working plane for several dimensions (1+1, 1+2, 1+3). Enlargement to
$L = 300$ or $L_z = 600$ gives no significant changes.

[Figure 6 plot. Panel title: "Total (up), fluctuation (down); d=1, 2, 3
bottom to top"]

Figure 6: Variation with $\beta$ of the total number of different people who
worked during at least one of the 10,000 iterations (three top curves), and
of the fluctuations in the work force (three bottom curves); $L = 30$,
$L_z = 60$, one sample. Data are averages over the last 1000 iterations.

[Figure 7 plot. Panel title: "Non-working managers: Q = Lz = 60 (+), 50 (x),
40 (*)"]

Figure 7: Results for the non-working managers model; ten samples of
10,000 iterations each, $d = 1$, $L = 30$, $Q = L_z$ increases from right to
left. Top: Average position of the lowest working plane. Middle: Average
fraction of failures where the process hits the lattice bottom at $L_z$.
Bottom: $\beta$ value, according to the middle part, where the failure
fraction reaches 50 percent. $\gamma = 0.3$.

[Figure 8 plot. Panel title: "Scaling in modified form, d=1, L=40 (+) to
400 (tr.)"]

Figure 8: Scaling plot of figure 7, top, with approximate collapse for
different $Q = L_z$ between 40 and 400.